With the widespread deployment of surveillance cameras, video-based monitoring methods have attracted considerable attention for various purposes, such as assisted living. Temporal redundancy and the huge size of raw videos are the two most common problems in video processing algorithms. Most existing methods focus mainly on improving accuracy by exploring consecutive frames, which is laborious and unsuitable for real-time applications. Since videos are mostly stored and transmitted in compressed formats, they are available in this form on many devices. Compressed videos contain much useful information, such as motion vectors and quantized coefficients. Properly exploiting this available information can substantially improve the performance of video understanding methods. This paper presents an approach that uses residual data, which is directly available in compressed videos and can be obtained through a partial decoding process. In addition, a method for accumulating similar residuals is proposed, which greatly reduces the number of frames processed for recognition. Applying a neural network solely to the accumulated residuals in the compressed domain accelerates performance, while the classification results remain highly competitive with raw-video methods.
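A minimal sketch of the accumulation idea in Python, assuming residual frames have already been extracted by partial decoding; the similarity rule and the threshold below are illustrative placeholders, not the paper's exact criterion:

```python
import numpy as np

def accumulate_residuals(residuals, threshold=0.5):
    """Greedily merge consecutive residual frames that are similar.

    `residuals` is a list of 2D arrays from a partially decoded video;
    the merge rule and `threshold` are illustrative, not the paper's.
    """
    accumulated = [residuals[0].astype(np.float32)]
    for res in residuals[1:]:
        res = res.astype(np.float32)
        # Relative difference between the incoming residual and the
        # current accumulator decides whether to merge or start anew.
        ref = np.abs(accumulated[-1]).mean() + 1e-8
        if np.abs(res - accumulated[-1]).mean() / ref < threshold:
            accumulated[-1] += res   # similar: accumulate in place
        else:
            accumulated.append(res)  # dissimilar: start a new group
    return accumulated               # far fewer frames than the input
```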
This paper presents a construction of a proper and stable labelled sample compression scheme of size $O(\VCD^2)$ for any finite concept class, where $\VCD$ denotes the Vapnik-Chervonenkis Dimension. The construction is based on a well-known model of machine teaching, referred to as recursive teaching dimension. This substantially improves on the currently best known bound on the size of sample compression schemes (due to Moran and Yehudayoff), which is exponential in $\VCD$. The long-standing open question whether the smallest size of a sample compression scheme is in $O(\VCD)$ remains unresolved, but our results show that research on machine teaching is a promising avenue for the study of this open problem. As further evidence of the strong connections between machine teaching and sample compression, we prove that the model of no-clash teaching, introduced by Kirkpatrick et al., can be used to define a non-trivial lower bound on the size of stable sample compression schemes.
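For reference, a brief sketch of the standard Littlestone-Warmuth notion underlying the result, with the properness and stability conditions paraphrased (the precise definitions in the paper may differ in detail):

```latex
% A labelled sample compression scheme of size $k$ for a concept class
% $\mathcal{C}$ over a domain $X$ is a pair $(\kappa,\rho)$: the compression
% map $\kappa$ sends any $\mathcal{C}$-realizable labelled sample $S$ to a
% labelled subsample of size at most $k$, and the reconstruction map $\rho$
% returns a hypothesis consistent with all of $S$:
\[
  \kappa(S) \subseteq S, \qquad |\kappa(S)| \le k, \qquad
  \rho(\kappa(S))(x) = y \quad \text{for all } (x,y) \in S.
\]
% Proper: $\rho$ always outputs a concept in $\mathcal{C}$.
% Stable (roughly): removing sample points outside $\kappa(S)$ does not
% change the reconstruction. The paper constructs such a scheme with
% $k \in O(\VCD^2)$, versus the previous bound exponential in $\VCD$.
```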
Anomaly analytics is a popular and vital task in various research contexts and has been studied for several decades. At the same time, deep learning has shown its capacity for solving many graph-based tasks, such as node classification, link prediction, and graph classification. Recently, many studies have extended graph learning models to anomaly analytics problems, resulting in beneficial advances in graph-based anomaly analytics techniques. In this survey, we provide a comprehensive overview of graph learning methods for anomaly analytics tasks. We classify them into four categories based on their model architectures, namely graph convolutional network (GCN), graph attention network (GAT), graph autoencoder (GAE), and other graph learning models. The differences between these methods are also compared in a systematic manner. Furthermore, we outline several graph-based anomaly analytics applications across various real-world domains. Finally, we discuss five potential future research directions in this rapidly growing field.
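As a reference point for the GCN-based family surveyed, a minimal numpy sketch of the standard propagation rule $H' = \sigma(\hat{D}^{-1/2}\hat{A}\hat{D}^{-1/2} H W)$; the anomaly-scoring heads built on top of it vary across the surveyed methods:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    A: (n, n) adjacency, H: (n, d_in) features, W: (d_in, d_out) weights.
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degree^{-1/2}
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation
```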
Basecalling is an essential step in nanopore sequencing analysis, where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., the target reference) and are thus discarded in later steps of the genomics pipeline, wasting the basecalling computation. To overcome this issue, we propose TargetCall, the first fast and widely applicable pre-basecalling filter to eliminate the wasted computation in basecalling. TargetCall's key idea is to discard reads that will not match the target reference (i.e., off-target reads) prior to basecalling. TargetCall consists of two main components: (1) LightCall, a lightweight neural network basecaller that produces noisy reads; and (2) Similarity Check, which labels each of these noisy reads as on-target or off-target by matching them to the target reference. TargetCall filters out all off-target reads before basecalling, and the highly accurate but slow basecalling is performed only on the raw signals whose noisy reads are labeled as on-target. Our thorough experimental evaluations using both real and simulated data show that TargetCall 1) improves the end-to-end basecalling performance of the state-of-the-art basecaller by 3.31x while maintaining high (98.88%) sensitivity in keeping on-target reads, 2) maintains high accuracy in downstream analysis, 3) precisely filters out up to 94.71% of off-target reads, and 4) achieves better performance, sensitivity, and generality compared to prior works. We freely open-source TargetCall at https://github.com/CMU-SAFARI/TargetCall.
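A structural sketch of the two-stage filter in Python; the three callables are hypothetical stand-ins for the components described above, not TargetCall's actual API:

```python
def targetcall_filter(raw_signals, target_ref, lightcall, similarity_check, basecall):
    """Sketch of TargetCall's pre-basecalling filter.

    `lightcall`: fast, lightweight NN producing noisy reads from raw signals.
    `similarity_check`: labels a noisy read on-/off-target against `target_ref`.
    `basecall`: the accurate but slow basecaller, run only on kept signals.
    """
    # Stage 1 + 2: keep only signals whose noisy reads look on-target.
    kept = [s for s in raw_signals
            if similarity_check(lightcall(s), target_ref)]
    # Expensive basecalling happens only on the surviving raw signals.
    return [basecall(s) for s in kept]
```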
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
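A minimal generation sketch using the Hugging Face transformers API, assuming the released checkpoints are fetched from the hub; the full 176B model needs multi-GPU or offloaded inference, so the smaller released variant bigscience/bloom-560m is used here for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download a released BLOOM checkpoint from the Hugging Face hub.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# Prompt the decoder-only model and greedily generate a continuation.
inputs = tokenizer("The BLOOM model was trained to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```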
Time delays in communication networks are one of the main concerns when deploying robots at the edge. This article proposes a multi-stage nonlinear model predictive control (NMPC) scheme capable of handling different network-induced time delays, establishing a control framework that guarantees collision-free navigation of micro aerial vehicles (MAVs). This study introduces a novel approach that considers different sampling times through a discretized scenario tree, in contrast to existing, typical multi-stage NMPC, where the scenario tree models system uncertainties. Furthermore, the method applies adaptive weights to the multi-stage NMPC scheme based on the probability of time delays in the communication link. Thanks to the multi-stage NMPC, the obtained optimal control actions are valid for multiple sampling times. Finally, the overall effectiveness of the proposed novel control framework is demonstrated in various tests and different simulation environments.
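An illustrative Python sketch of the multi-stage idea, not the paper's MAV model: one control sequence is scored against several delay scenarios, each discretized with its own sampling time and weighted by its estimated probability; the toy dynamics, cost, and delay distribution below are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

DELAYS = [0.05, 0.10, 0.20]   # candidate network delays [s] (assumed)
PROBS = [0.6, 0.3, 0.1]       # adaptive weights from delay statistics
HORIZON, X0, X_REF = 10, np.zeros(2), np.array([1.0, 0.0])

def rollout_cost(u_seq, dt):
    """Simulate a toy double integrator with sampling time dt."""
    x, cost = X0.copy(), 0.0
    for u in u_seq.reshape(HORIZON, 1):
        x = x + dt * np.array([x[1], u[0]])
        cost += dt * (np.sum((x - X_REF) ** 2) + 0.1 * u[0] ** 2)
    return cost

def multistage_cost(u_seq):
    # Expected cost over the discretized scenario tree of sampling times,
    # so one control sequence stays valid across the delay scenarios.
    return sum(p * rollout_cost(u_seq, d) for p, d in zip(PROBS, DELAYS))

u_opt = minimize(multistage_cost, np.zeros(HORIZON)).x
```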
In this article, we present a reactive constrained navigation scheme with embedded obstacle avoidance for an unmanned aerial vehicle (UAV), enabling navigation in obstacle-dense environments. The proposed navigation architecture is based on nonlinear model predictive control (NMPC) and utilizes an onboard 2D LiDAR to detect obstacles and translate, online, key geometric information about the environment into parametric constraints for the NMPC that restrict the available position space of the UAV. This article also focuses on the real-world implementation and experimental validation of the proposed reactive navigation scheme, applying it in multiple challenging laboratory experiments, where we also compare it against relevant reactive obstacle-avoidance methods. The solvers used in the proposed approach are the Optimization Engine (OpEn) and the Proximal Averaged Newton for Optimal Control (PANOC) algorithm, in which a penalty method is employed to properly account for the obstacle and input constraints during the navigation task. The proposed novel scheme allows for fast solutions while using limited onboard computational power, a required feature for the overall closed-loop performance of a UAV, and is applied in multiple real-time scenarios. The combination of embedded obstacle avoidance and real-time applicability makes the proposed reactive constrained navigation scheme an elegant framework for UAVs, capable of performing fast nonlinear control, local path planning, and obstacle avoidance, all embedded in the control layer.
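A generic penalty-method sketch in Python (not the OpEn/PANOC solver code itself): an obstacle constraint $c(x) \ge 0$ is folded into the NMPC cost as a quadratic penalty $\mu \max(0, -c(x))^2$, mirroring how LiDAR-derived position constraints enter the optimization; the toy dynamics and weights are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

OBSTACLE, RADIUS, MU = np.array([0.5, 0.5]), 0.3, 1e3   # assumed values
GOAL, HORIZON, DT = np.array([1.0, 1.0]), 15, 0.1

def cost(u_seq):
    """Tracking cost plus quadratic penalties for obstacle violation."""
    x, total = np.zeros(2), 0.0
    for u in u_seq.reshape(HORIZON, 2):
        x = x + DT * u                          # toy single-integrator UAV
        violation = RADIUS - np.linalg.norm(x - OBSTACLE)
        total += np.sum((x - GOAL) ** 2) + 0.01 * np.sum(u ** 2)
        total += MU * max(0.0, violation) ** 2  # penalty if inside obstacle
    return total

u_opt = minimize(cost, np.zeros(2 * HORIZON)).x
```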
We present Pythest, a Python package that provides tools for simultaneously estimating multiple change points in the distribution of piecewise stationary time series. The implemented nonparametric algorithms are provably consistent in a general framework: when the samples are generated by an unknown piecewise stationary process. In this setting, the samples may exhibit long-range dependencies of arbitrary form, and the finite-dimensional marginals of any (unknown) fixed size before and after the change points may be the same. The strength of the algorithms included in the package lies in their ability to consistently detect changes without imposing any assumptions beyond the existence of the underlying process distributions. We illustrate this distinguishing feature by comparing the performance of the package against state-of-the-art models designed for the setting where the samples are independently and identically distributed.
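A hypothetical usage sketch: the module and function names below (pythest, estimate_changepoints) are illustrative stand-ins only, since the package's actual API is not given in the abstract:

```python
import numpy as np
# import pythest  # assumed package name; see the package docs for the real API

# Piecewise stationary sample: the marginal distribution shifts at t = 500.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0.0, 1.0, 500),
                         rng.normal(1.5, 1.0, 500)])

# Hypothetical call: estimate the change-point locations nonparametrically.
# estimated = pythest.estimate_changepoints(series, n_changes=1)
# print(estimated)  # expected: an index near 500
```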
Over the past few decades, there has been intensive research on the unmixing of hyperspectral images. Methods such as NMF, VCA, and N-FINDR have become standards, as they have demonstrated robustness in handling the unmixing of hyperspectral images. However, research on the unmixing of multispectral images is relatively scarce. We therefore extend some unmixing methods to multispectral images. In this paper, we create two simulated multispectral datasets from two hyperspectral datasets whose ground truths are given. We then apply the unmixing methods (VCA, NMF, N-FINDR) to these two datasets. By comparing and analyzing the results, we obtain some interesting findings about using VCA, NMF, and N-FINDR with multispectral datasets. Moreover, this also demonstrates the possibility of extending these unmixing methods to the field of multispectral imaging.
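A minimal unmixing sketch with scikit-learn's NMF on synthetic multispectral data (band and endmember counts are illustrative; VCA and N-FINDR are not in scikit-learn and would need separate implementations):

```python
import numpy as np
from sklearn.decomposition import NMF

# Synthetic multispectral scene: few bands, linear mixing of endmembers.
n_pixels, n_bands, n_endmembers = 1000, 8, 3
rng = np.random.default_rng(0)
true_E = rng.random((n_endmembers, n_bands))             # endmember spectra
true_A = rng.dirichlet(np.ones(n_endmembers), n_pixels)  # abundances sum to 1
X = true_A @ true_E + 0.01 * rng.random((n_pixels, n_bands))

# Factor pixels into nonnegative abundances and endmember spectra.
model = NMF(n_components=n_endmembers, init="nndsvda", max_iter=500)
abundances = model.fit_transform(X)   # (n_pixels, n_endmembers)
endmembers = model.components_        # (n_endmembers, n_bands)
```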
We develop a novel framework for efficiently and effectively discovering crowdsourced services that move in close proximity to a user over a period of time. We introduce a moving crowdsourced service model, in which a service is modeled as a moving region. We propose a deep reinforcement learning-based composition approach to select and compose moving IoT services while considering quality parameters. In addition, we develop a parallel flock-based service discovery algorithm as a benchmark to measure the accuracy of the proposed approach. Experiments on two real-world datasets verify the effectiveness and efficiency of the deep reinforcement learning-based approach.
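A minimal Python sketch of the spatiotemporal discovery step, under assumed trajectory inputs: the moving region is simplified to a disc around the service's center, and the RL-based quality-aware composition stage is omitted:

```python
import numpy as np

def discover_services(user_traj, services, radius=1.0):
    """Filter moving crowdsourced services near a user over time.

    `user_traj` and each service trajectory are (T, 2) position arrays
    sampled at the same time steps; a service is a candidate if its
    moving region (a disc of `radius`) covers the user at every step.
    """
    candidates = []
    for service_id, service_traj in services.items():
        dists = np.linalg.norm(user_traj - service_traj, axis=1)
        if np.all(dists <= radius):   # user stays inside the moving region
            candidates.append(service_id)
    return candidates
```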